Supplemental Material: Perception of Perspective Distortions in Image-Based Rendering
Abstract
This document contains supplemental information and clarifications for the paper Perception of Perspective Distortions in Image-Based Rendering by Vangorp et al. [2013], published in ACM Transactions on Graphics 32, 4 (Proceedings of ACM SIGGRAPH 2013).

1 Extended Retinal Hypothesis

1.1 Frontoparallel Capture

The derivation of the extended retinal hypothesis assumes frontoparallel capture for simplicity. If the capture were not frontoparallel, an additional coordinate system transformation would be required because the façade features would no longer be aligned with the capture camera axes. However, the derivation for the general case would lead to identical final equations. Thus, the viewing direction of the capture camera does not affect the predictive model. This can be seen intuitively by considering that we first capture an image and project it back onto the proxy geometry using standard perspective projection. This back-projection also works for capture cameras that are not aligned with the façade, or even for cylindrical or spherical projections of 360° panoramas (for which "frontoparallel" is not even defined). The remainder of the derivation only requires that the center of projection of the capture camera be defined. Eccentricity should be measured from the point on the façade that is closest to the capture camera's center of projection. For a frontoparallel capture camera, that point happens to be at the center of the captured image.

1.2 Simulation Camera Derivation

The following transformation changes from capture to simulation camera coordinates:

    x_s = (x_c − c · tan θ_e) · cos θ_s + (z_c − c) · sin θ_s        (1)
    y_s = y_c                                                        (2)
    z_s = (−x_c + c · tan θ_e) · sin θ_s + (z_c − c) · cos θ_s + d   (3)

Using this transformation, and Equation 6 from the main document, we obtain the projected camera coordinates of the vanishing point for the front face:

    x_s = −sign(θ_s) · lim_{t→∞} t · cos θ_s + [−c · tan θ_e · cos θ_s]      (4)
    y_s = 0                                                                  (5)
    z_s = sign(θ_s) · lim_{t→∞} t · sin θ_s + [c · tan θ_e · sin θ_s] + d    (6)

The terms in square brackets can be dropped: they are negligible as t approaches infinity if the common trigonometric factor is nonzero, and they are zero if the common trigonometric factor is zero. The projected camera coordinates of the vanishing point for the side face are:

    x_s = −c · tan θ_e · cos θ_s       (7)
    y_s = 0                            (8)
    z_s = c · tan θ_e · sin θ_s + d    (9)

We then perform perspective projection with focal length f_s, resulting in Equations 7–10 in the main document.
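The limit in Equations (4)–(6) can be checked numerically: transforming a point far out on the front face with Equations (1)–(3) and projecting it should approach the closed-form vanishing point. The Python sketch below is not part of the original supplement; the parameter values c, d, f_s, θ_e, θ_s are arbitrary, and the front-face parameterization x_c = −sign(θ_s)·t, z_c = c is inferred so as to be consistent with how Equations (4)–(6) follow from Equations (1)–(3) (the exact parameterization is Equation 6 of the main document, not reproduced here).

```python
import math

# Hypothetical parameter values (the supplement does not fix them).
c = 40.0          # capture camera distance to the front face
d = 40.0          # simulation camera distance
f_s = 1.0         # simulation focal length
theta_e = math.radians(20.0)   # eccentricity angle
theta_s = math.radians(15.0)   # simulation (slant) angle

def capture_to_simulation(x_c, y_c, z_c):
    """Capture-to-simulation coordinate transformation, Equations (1)-(3)."""
    x_s = (x_c - c * math.tan(theta_e)) * math.cos(theta_s) + (z_c - c) * math.sin(theta_s)
    y_s = y_c
    z_s = (-x_c + c * math.tan(theta_e)) * math.sin(theta_s) + (z_c - c) * math.cos(theta_s) + d
    return x_s, y_s, z_s

def project(x_s, y_s, z_s):
    """Perspective projection with focal length f_s."""
    return f_s * x_s / z_s, f_s * y_s / z_s

# Points far along the front face (z_c = c) approach the front-face vanishing point.
for t in (1e2, 1e4, 1e6):
    x_c = -math.copysign(t, theta_s)
    u, _ = project(*capture_to_simulation(x_c, 0.0, c))
    print(f"t = {t:>9.0f}: projected front-face point u = {u:.6f}")

# Limit of Equations (4)-(6) after dropping the bracketed terms and projecting:
print("front-face vanishing point u =", -f_s / math.tan(theta_s))

# Side-face vanishing point from Equations (7)-(9), then projected:
x_s = -c * math.tan(theta_e) * math.cos(theta_s)
z_s = c * math.tan(theta_e) * math.sin(theta_s) + d
print("side-face vanishing point u  =", project(x_s, 0.0, z_s)[0])
```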
2 Experimental Design

2.1 Stimulus Generation

The stimulus images were created using the PBRT offline renderer [Pharr and Humphreys 2010] in two passes:

1. The capture camera created a wide-angle frontoparallel image of the three different realistically-dimensioned façade designs from a distance of 40 m. The capture camera had a field of view of 107°×20° and a very high resolution of 17200×2224 pixels to avoid noticeable aliasing artifacts in the next pass.

2. The captured image was projectively textured onto a single plane with a slant angle θ_s of 0°, ±15° or ±30° with respect to the optical axis of the simulation camera. The simulation camera was pointed straight at a corner at a distance of 40 m and had a field of view of 21°×16° and a resolution of 2048×1536 pixels. This resolution was downsampled for display on the different devices.

Both render passes used high-quality antialiasing and EWA texture filtering [Greene and Heckbert 1986] to ensure artifact-free final images.

2.2 Display Setup

We wrote our experiments in MATLAB, using the Psychophysics Toolbox [Kleiner et al. 2007] to display stimuli and collect keyboard input. The hinge device used a potentiometer to record the angle between its sides. The TV can be treated like a normal PC monitor, as it can be connected directly to a computer. For the phone and the tablet, we instead use VNC to relay the display content from the computer over a wireless network. On these devices, we also use a Bluetooth keyboard for input.

Display   Diagonal   Image diagonal   Viewing distance   Retinal factor
Phone     3.5"       67.4 mm          341 mm             2.37
Tablet    9.7"       207 mm           522 mm             1.18
PC        24"        458 mm           848 mm             0.87
TV        55"        920 mm           1450 mm            0.74

Table 1: The four display conditions used in our experiments. Columns are: the screen diagonal, the diagonal of the stimulus image, the corresponding preferred viewing distance [Cooper et al. 2012], and the retinal minification factor v/(M · f_s).

2.3 Screening of Participants

The hinge angle-matching task had precise instructions on how to align the hinge device with the façade and reproduce the angle. The rating task requires participants to use the same internal decision-making process to provide self-consistent ratings. Both tasks involve response mapping between the observed angle in the displayed stimulus and either the observed angle of the physical hinge device or the 5-point rating. Some participants are known to have difficulty performing such response mapping tasks [Watt et al. 2005]. Combined with the large number of conditions in this within-subject design, performing both tasks for extended periods requires significant effort and motivation from the participants. It is accepted practice to verify in advance that participants will be able to perform such an extensive experiment, without biasing the result. Therefore, participants were screened using short versions of the experiments (112 trials). We excluded 3 participants: one for not following the instructions for setting the hinge precisely, one for rating the same stimulus at the two extremes of the scale 3 times each, and one who gave multiple inconsistent responses to the same hinge stimulus.

3 Experiment 1: Hinge Angle Matching

3.1 Additional Results Figures

Figure 1 shows the results of the hinge angle-matching experiment for each of the 4 display devices. There are no significant differences visible. Figure 2 shows the results of the hinge angle-matching experiment for the 6 participants separately. There are differences between participants, but the repeated-measures design allows us to study the effects of interest while keeping the variability due to individual differences low. Many more plots are available on the project web page.

3.2 Analysis of Variance

Table 2 shows the ANOVA table for the factors display device, simulation angle, eccentricity angle, façade depth and façade design. Compared to the ANOVA in the main document that omitted the façade design factor, all significant effects remain significant but with a potentially reduced effect size. The main effect of façade depth and the interaction between simulation and eccentricity angle are now reduced to medium-large effect sizes (0.13 < η_G < 0.26). There are no additional statistically significant effects of façade design with appreciable effect sizes.

Source                        df    SS      MS      F      p         Sig   η_G
Display Device (Dev)          3     10716   3572    7.939  0.0021    ⋆     0.051
Simulation Angle (Sim)        4     2333    583.2   7.102  0.000995  ⋆     0.011
Eccentricity Angle (Ecc)      3     168059  56020   50.05  4.78e-08  ⋆     0.456
Façade Depth (Dep)            2     56364   28182   61.84  2.34e-06  ⋆     0.219
Façade Design (Faç)           2     2647    1324    0.773  0.487           0.013
Dev×Sim                       12    168.9   14.07   0.551  0.872           0.001
Dev×Ecc                       9     1387    154.2   1.362  0.234           0.007
Dev×Dep                       6     626.9   104.49  1.647  0.169           0.003
Dev×Faç                       6     1655    275.9   2.038  0.0912          0.008
Sim×Ecc                       12    52016   4335    11.39  1.92e-11  ⋆     0.206
Sim×Dep                       8     332.8   41.60   1.228  0.308           0.002
Sim×Faç                       8     699.1   87.39   2.067  0.0626          0.003
Ecc×Dep                       6     7783    1297    9.135  1.03e-05  ⋆     0.037
Ecc×Faç                       6     9315    1552.5  4.176  0.00362   ⋆     0.044
Dep×Faç                       4     6121    1530.2  8.003  0.000506  ⋆     0.030
Dev×Sim×Ecc                   36    2050    56.93   2.142  0.000592  ⋆     0.010
Dev×Sim×Dep                   24    524.2   21.84   0.884  0.622           0.003
Dev×Sim×Faç                   24    629.1   26.21   1.125  0.328           0.003
Dev×Ecc×Dep                   18    1285    71.39   1.933  0.0225    ⋆     0.006
Dev×Ecc×Faç                   18    850     47.24   1.163  0.309           0.004
Dev×Dep×Faç                   12    232.3   19.36   0.662  0.781           0.001
Sim×Ecc×Dep                   24    1308    54.50   1.745  0.0268    ⋆     0.006
Sim×Ecc×Faç                   24    6663    277.61  2.957  5.29e-05  ⋆     0.032
Sim×Dep×Faç                   16    455.2   28.45   1.022  0.443           0.002
Ecc×Dep×Faç                   12    4432    369.3   5.123  8.19e-06  ⋆     0.022
Dev×Sim×Ecc×Dep               72    2354    32.70   1.224  0.121           0.012
Dev×Sim×Ecc×Faç               72    2125    29.52   1.096  0.292           0.010
Dev×Sim×Dep×Faç               48    1291    26.89   1.088  0.333           0.006
Dev×Ecc×Dep×Faç               36    1502    41.71   1.549  0.0336    ⋆     0.007
Sim×Ecc×Dep×Faç               48    1906    39.71   1.382  0.0613          0.009
Dev×Sim×Ecc×Dep×Faç           144   3236    22.48   0.884  0.818           0.016

Table 2: Results of our repeated-measures ANOVA on the hinge data. The columns list: sources of variance, their degrees of freedom (df), sum of squares (SS), mean square (MS), F-statistic, p-value, significance code for α = 0.05, and generalized η_G effect size.
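As a rough illustration of how a repeated-measures design of this kind can be set up, the following Python sketch uses statsmodels' AnovaRM on made-up data with only three of the five within-subject factors (the factor level labels for eccentricity and depth are placeholders). This is not the authors' analysis code, and AnovaRM does not report the generalized η_G effect sizes listed in Table 2; those would have to be computed separately.

```python
import itertools
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical balanced dataset: 6 participants, three within-subject factors.
# The real design also includes display device and facade design; the
# responses here are random numbers, not experimental data.
rng = np.random.default_rng(0)
subjects = range(1, 7)
sim_angles = [-30, -15, 0, 15, 30]            # slant angles from Section 2.1
ecc_levels = ["e0", "e1", "e2", "e3"]          # placeholder eccentricity levels
depth_levels = ["shallow", "medium", "deep"]   # placeholder depth levels

rows = [
    {"subject": s, "sim": sim, "ecc": ecc, "depth": dep,
     "perceived_angle": 90 + rng.normal(scale=10)}
    for s, sim, ecc, dep in itertools.product(subjects, sim_angles,
                                              ecc_levels, depth_levels)
]
data = pd.DataFrame(rows)

# Repeated-measures ANOVA with all factors treated as within-subject.
res = AnovaRM(data, depvar="perceived_angle", subject="subject",
              within=["sim", "ecc", "depth"]).fit()
print(res)
```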
4 Experiment 2: Angle Rating

4.1 Additional Results Figures

Figure 3 shows the results of the angle-rating experiment for each of the 4 display devices. There are no significant differences visible. Figure 4 shows the results of the angle-rating experiment for the 6 participants separately. There are differences between participants, as can be expected in rating tasks, but the pattern that stands out across all participants is the blue diagonal of angles rated close to a right angle. Many more plots are available on the project web page.

4.2 Follow-up Experiment

Figure 5 shows the results of the follow-up experiment with real stimuli for both the hinge angle-matching task (a) and the rating task (b). The results are qualitatively similar to the results of the main experiments with synthetic stimuli, but overall larger ranges of angles and ratings were obtained.

5 A Predictive Model for Perspective Distortion in Street-level IBR

5.1 Flattening of Perceived Angles

The retinal hypothesis (Equations 12–16 in the main document) predicts different results for the different devices because the viewing distance relative to the COP differs across devices. However, Figure 1 did not reveal systematic differences across devices. Our explanation for the lack of a device effect is that the effect of distance from the COP is overshadowed by the overall compression of responses towards 90°, likely due to familiarity with 90° balcony shapes such as the ones in our stimuli [Yang and Kubovy 1999; Perkins 1972]. For this reason, we assumed in Figure 7(a) in the main document a modified viewing distance v′ = 1, i.e., the viewer is at the COP. In addition to this compensation for the viewing distance relative to the COP distance, an overall flattening effect independent of the stimulus parameters can occur because binocular viewing of a flat, non-stereoscopic display surface provides binocular depth cues that conflict with the pictorial cues from converging parallel lines [Yang and Kubovy 1999].
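The device dependence that the retinal hypothesis predicts can be illustrated with elementary picture-viewing geometry. The sketch below is an illustration only, not Equations 12–16 of the main document: using the retinal minification factors v/(M · f_s) from Table 1, a feature that would subtend a visual angle α when viewed from the COP subtends β with tan β = tan α / (v/(M · f_s)) at the preferred viewing distance. The 5° example angle is arbitrary.

```python
import math

# Retinal minification factors v / (M * f_s) from Table 1.
retinal_factor = {"Phone": 2.37, "Tablet": 1.18, "PC": 0.87, "TV": 0.74}

def viewed_angle(intended_deg, factor):
    """Visual angle at the preferred viewing distance of a picture feature
    that would subtend intended_deg when viewed from the COP:
    tan(beta) = tan(alpha) / (v / (M * f_s))."""
    return math.degrees(math.atan(math.tan(math.radians(intended_deg)) / factor))

intended = 5.0  # hypothetical feature eccentricity as seen from the COP, in degrees
for device, factor in retinal_factor.items():
    print(f"{device:>6}: {intended:.1f}° from the COP appears as "
          f"{viewed_angle(intended, factor):.2f}° at the preferred viewing distance")
```

Factors above 1 (Phone, Tablet) shrink retinal angles relative to the COP view, while factors below 1 (PC, TV) enlarge them; as noted above, the measured responses did not show this device dependence, which motivated the assumption v′ = 1.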
Figure 1 (fragment): per-device plots of perceived angle [°]; panels for the Phone and Tablet displays, with axis ticks from −30° to 30°.